Genome Medicine
Top medRxiv preprints most likely to be published in this journal, ranked by match strength.
Severe combined immunodeficiency (SCID) is a heterogeneous, recessive disorder, associated with the onset of severe, recurrent infections in the first few months of life. SCID is fatal if left untreated, but outcomes can be significantly improved by prompt diagnosis and treatment, particularly prior to onset of infection. Consequently, SCID is already included in many newborn screening programmes around the world, as well as multiple international genomic newborn screening (gNBS) research progra...
Genetic diagnosis remains a formidable challenge, characterized by a diagnostic odyssey that spans years; over half of rare disease patients remain undiagnosed, and rare diseases affect more than 300 million people worldwide. Clinicians must navigate through thousands of candidate variants against a noisy and fragmented literature landscape, a task that overwhelms human cognitive capacity and conventional decision-making approaches. Recent advances in agentic artificial intelligence systems have demonstra...
Motivation: Fanconi anemia (FA) is a rare disease mainly caused by biallelic pathogenic variants, including structural variants such as large deletions and insertions in FA genes. Currently, variant detection is based on short-read sequencing and probe-based approaches. However, determining the exact genomic breakpoint or achieving allelic discrimination remains challenging. Nanopore-based long-read sequencing enables a comprehensive detection of FA variants, but a unified bioinformatic analysis p...
Background: Most rare coding variants in monogenic disease genes remain classified as Variants of Uncertain Significance (VUS), limiting their use in clinical care. Many variant classifications have been submitted to ClinVar, often with rich free-text summaries of the evidence underlying each classification. These narratives are not standardized and are difficult to mine systematically, making it challenging to identify variants that might be reclassified as new evidence becomes available. Method...
Background: Variation in the HLA loci, located on human chromosome 6p, has been associated with hundreds of diseases and conditions. However, high levels of polymorphism that characterize the HLA system, coupled with generally modest effect sizes for most phenotypes, necessitate relatively large sample sizes to power association studies; meanwhile, high resolution HLA genotyping remains relatively resource intensive. These constraints limit identification of novel associations. While phenome-wide ...
RNA sequencing (RNA-seq) provides a powerful complement to DNA sequencing for uncovering pathogenic defects affecting gene expression and splicing in individuals with genetically undiagnosed rare disorders. However, as large rare disease consortia adopt RNA-seq, challenges arise due to cohort heterogeneity, variability in tissues and sample sizes, and differences in interpretation practices. Here, we present a harmonized analytical and interpretation framework developed by the pan-European Solv...
The Clinical Pharmacogenetics Implementation Consortium (CPIC) bases its drug-gene recommendations on the assignment of star alleles, which map known genotypes to defined functional categories and corresponding drug dosage guidelines. The star allele framework, first proposed in 1996 for the CYP gene family and later formalized with CPIC's establishment in 2010 [1, 2], remains foundational to pharmacogenomics. However, this system has notable limitations. Its dependence on a restricted set of ben...
Background: Klebsiella pneumoniae is a common cause of neonatal sepsis in Africa, and is frequently hospital acquired. We recently reported an outbreak of multidrug-resistant K. pneumoniae sepsis amongst neonates at a rural hospital in The Gambia, West Africa, involving 57 cases and case fatality of 60%. Here we undertook a retrospective pathogen genomic epidemiology study of clinical and environmental K. pneumoniae isolated during the outbreak, to identify the outbreak strain, refine the epidemic...
BACKGROUND: Genetic variant curation, an important step in the implementation of Genomic Medicine, requires literature-guided comparison of variant prevalence in affected individuals versus healthy controls. This evidence is categorized as the PS4 evidence code by the AMP/ACMG variant interpretation guidelines and its manual extraction is a major bottleneck in clinical variant curation. This study aimed to evaluate whether reasoning-capable large language models (LLMs) can support guideline-constr...
Background: Mitochondrial diseases are the most common inherited metabolic disorders, characterized by pronounced clinical and genetic heterogeneity that complicates molecular diagnosis. Although DNA-based sequencing approaches have become standard in genetic testing, up to half of patients remain without a definitive diagnosis. RNA sequencing (RNA-seq) provides a complementary layer of evidence by revealing functional consequences of genetic variation, thereby improving diagnostic yield. Methods...
Structural variants (SVs) are a major source of genomic diversity and disease susceptibility; however, populations from the Middle East and North Africa (MENA) region remain critically underrepresented in global reference databases. We provide the first detailed catalogue of structural variation in 61 individuals from diverse MENA countries, using publicly available ultra-long Oxford Nanopore sequencing. A scalable and dual-reference alignment-based method (GRCh38 and T2T-CHM13) was employed to ...
Fanconi anemia (FA) is a rare genetic disorder of impaired DNA repair characterized by progressive bone marrow failure, congenital malformations, and cancer predisposition. Early identification of individuals with FA is critical for timely clinical management, yet phenotype-driven approaches to FA identification are hindered by inconsistencies in existing phenotypic profiles. We compared the Human Phenotype Ontology (HPO) annotations for FA in OMIM (215 terms across 22 complementation group entr...
Rare Mendelian disorders affect 300-400 million people globally. Although genetic testing has become widely adopted, gene-specific evidence for tailored variant interpretation remains scattered across resources. We present Gene Portals, a framework for gene-centered multimodal knowledge bases that co-localize expert-harmonized clinical data, functional assays, population variation, structural annotations and gene-specific ACMG/AMP specifications within a single resource. A modular interface inte...
Background: Exome sequencing (ES) has become a key diagnostic tool for rare diseases (RDs). However, most evidence on ES performance comes from high-income countries and patients of European ancestry. In countries such as Chile, limited access to next generation sequencing amplifies health disparities and highlights the need to identify which patients are most likely to benefit from ES. Methods: This study presents the second phase of the Chilean DECIPHERD project, in which we performed ES in a n...
Objective: To evaluate the analytical and clinical performance of fetal fraction (FF) enriched genome-wide noninvasive prenatal testing (GW-NIPT) for detection of clinically relevant copy number variants (CNVs) down to 1 Mb. Methods: We retrospectively analyzed 10,501 singleton pregnancies tested with FF enrichment-based GW-NIPT between August 2023 and July 2025. CNV analysis was performed using BinDel and WisecondorX. Results: FF enrichment increased median FF to 24% (2.4-fold increase). Clinically...
We assessed the impact of plasma protein quantitative trait loci (pQTL) on therapeutic hypotheses backed by human genetic evidence. We show that pQTL-supported target-indication pairs were 4.7 times more likely to advance from Phase I to launch, compared to a 2.6-fold increase observed only with human genetic evidence. Moreover, pQTL-based enrichment was prominent in druggable protein families which had limited enrichment from human genetic evidence alone.
Mitochondria are semi-autonomous organelles whose generation and maintenance demand precise expression, processing, and assembly of >1,000 proteins encoded across two genomes. To explore this cooperativity, we performed multiomic analyses on >200 cell lines harboring mitochondrial gene perturbations, generating >26M molecular measurements. Our data reveal that mitochondrial proteome homeostasis is heavily influenced by post-transcriptional processes. Through nearest neighbor analyses, we reveal ...
Ewing sarcoma (EwS) is a rare, aggressive pediatric malignancy driven by FET::ETS family fusions (EWSR1::FLI1 in >85% of cases) with no established environmental risk factors. To investigate germline predisposition, we analyzed 2,014 EwS cases and 10,525 cancer-free controls in a two-stage analysis that combined an international genome-wide association study and a case-parent trio study. The combined meta-analysis identified 18 variants at 14 susceptibility loci (9 novel, 5 replicated) wi...
Genome-wide association studies (GWAS) on asthma have identified nearly 200 genomic loci. However, the underlying mechanisms remain mostly elusive. While functional profiling of blood immune cell types has helped interpret asthma GWAS signals, high-resolution functional genomic data of lung immune cells, which differ from circulating immune cells, are lacking. We thus profiled single-cell multi-omics (RNA-seq and ATAC-seq) on lymphocytes of lung and spleen tissues from 9 donors. Cross-tissue compari...
Cohesin is a fundamental genome-organizing complex that orchestrates three-dimensional chromosome folding and gene expression via DNA loop extrusion. Alterations to genes encoding cohesin subunits and cohesin loaders cause Mendelian disorders, including Cornelia de Lange syndrome (CdLS). By contrast, disruption of factors that remove cohesin from DNA, including WAPL and its binding partners PDS5A and PDS5B, has not yet been associated with human disease. Here, we explored the relevance of these...